A Naïve Bayes classifier for Shakespeare's second-person pronoun
Author
Abstract
In order to investigate in explicit detail the way that y- and th-pronouns alternate in the Shakespearean corpus, I have undertaken a collocational analysis of the full corpus of Shakespeare's 37 plays and found that (1) second-person pronouns can be disambiguated based on context alone, (2) y-pronouns seem to be used in more formal situations or when an inferior is addressing a social better, and (3) the th-pronoun is reserved for addressing peers, servants, or other familiar personages. Through the Python Natural Language Toolkit (Bird et al., 2009, Natural Language Processing with Python. Sebastopol, CA: O'Reilly Media), I implemented a Naïve Bayes classifier that in effect treats each occurrence of a second-person pronoun as a black box that must be resolved into either a y-pronoun or a th-pronoun based only on the surrounding words. Using tenfold cross-validation, the classifier achieves an accuracy of 78.3% when fellow th- and y-pronouns are excluded from the context and 88.0% when we allow fellow th- and y-pronouns to assist in classification. Most interesting, however, are the context words that prove most informative in categorizing the pronouns. Significantly, the words most useful in classifying a pronoun as a y-pronoun include high-register words such as lordship, madam, lords, and sir. After a group of conjugated second-person verbs like art and wert, the words most associated with th-pronouns are words such as torment, nuncle, lesser, and villain. The ability to discriminate between forms based only on context confirms the hypothesis that the two classes of second-person pronoun are indeed used distinctly in the Shakespearean corpus. The list of words most helpful in making that distinction strongly suggests a difference in formality. We can also gain additional insight into the plays by examining some of the unexpected words that collocate with either one form or the other.
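The classification setup described in the abstract can be illustrated with a minimal sketch. It uses only the Python standard library rather than NLTK, so it is not the author's implementation. The pronoun lists, the window size of three words, the add-one smoothing, and the toy token sequence are all assumptions made for illustration and do not come from the paper or from the plays.

```python
import math
from collections import Counter, defaultdict

# Assumed pronoun inventories; the paper's exact lists are not given here.
Y_FORMS = {"you", "your", "yours", "ye", "yourself"}
TH_FORMS = {"thou", "thee", "thy", "thine", "thyself"}


def pronoun_instances(tokens, window=3, keep_fellow_pronouns=False):
    """Yield (context_words, label) for each second-person pronoun.

    The pronoun itself is hidden; only neighbouring words are kept.
    """
    for i, tok in enumerate(tokens):
        if tok in Y_FORMS:
            label = "y"
        elif tok in TH_FORMS:
            label = "th"
        else:
            continue
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if not keep_fellow_pronouns:
            context = [w for w in context if w not in Y_FORMS | TH_FORMS]
        yield set(context), label


class NaiveBayes:
    """Naive Bayes over word-presence features, scoring only words present."""

    def __init__(self, instances):
        self.label_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for words, label in instances:
            self.label_counts[label] += 1
            for w in words:
                self.word_counts[label][w] += 1
                self.vocab.add(w)

    def log_prob(self, word, label):
        # Add-one smoothed estimate of P(word appears in context | label).
        seen = self.word_counts[label][word]
        return math.log((seen + 1) / (self.label_counts[label] + 2))

    def classify(self, words):
        total = sum(self.label_counts.values())

        def score(label):
            prior = math.log(self.label_counts[label] / total)
            return prior + sum(
                self.log_prob(w, label) for w in words if w in self.vocab
            )

        return max(self.label_counts, key=score)


# Invented toy tokens, for illustration only (not a quotation).
toy_tokens = "good sir i thank you madam . thou art a villain and i scorn thee".split()
instances = list(pronoun_instances(toy_tokens))
model = NaiveBayes(instances)
print(model.classify({"sir", "madam"}))
print(model.classify({"art", "villain"}))
```

Setting `keep_fellow_pronouns=True` corresponds to the second condition in the abstract, where neighbouring th- and y-pronouns are allowed to help. A real evaluation would split the instances into ten folds, train on nine, and test on the held-out fold in turn.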
Similar resources
Person Independent Recognition Of Dynamic Hand Gestures Using Vision Based Systems
When a sign language recognition system is used for teaching purposes, the students cannot train the system. Therefore person independent recognition has to be performed by using a pre-assembled gesture data set. Although many systems have been developed that train on and recognize the same person, little research has been done on person independent gesture recognition. In this paper we present three...
A Sensor-Based Scheme for Activity Recognition in Smart Homes using Dempster-Shafer Theory of Evidence
This paper proposes a scheme for activity recognition in sensor based smart homes using Dempster-Shafer theory of evidence. In this work, opinion owners and their belief masses are constructed from sensors and employed in a single-layered inference architecture. The belief masses are calculated using beta probability distribution function. The frames of opinion owners are derived automatically ...
Performance Analysis of Privacy Preserving Naïve Bayes Classifiers for Distributed Databases
The problem of secure and fast distributed classification is an important one. The main focus of the paper is on privacy preserving distributed classification rule mining. This research paper addresses the performance analysis of privacy preserving Naïve Bayes classifiers for horizontal and vertical partitioned databases. The Naïve Bayes classifier is a simple but efficient baseline classifier....
Image Classification Using Naïve Bayes Classifier
An image classification scheme using Naïve Bayes Classifier is proposed in this paper. The proposed Naive Bayes Classifier-based image classifier can be considered as the maximum a posteriori decision rule. The Naïve Bayes Classifier can produce very accurate classification results with a minimum training time when compared to conventional supervised or unsupervised learning algorithms. Compreh...
Boosting the Tree Augmented Naïve Bayes Classifier
The Tree Augmented Naïve Bayes (TAN) classifier relaxes the sweeping independence assumptions of the Naïve Bayes approach by taking account of conditional probabilities. It does this in a limited sense, by incorporating the conditional probability of each attribute given the class and (at most) one other attribute. The method of boosting has previously proven very effective in improving the per...
Journal title:
- LLC
Volume 27, Issue
Pages -
Publication date: 2012